Experiences with Achieving Portability across Heterogeneous Architectures
Authors
Abstract
The increasing computational needs of parallel applications inevitably require portability across popular parallel architectures, which are becoming heterogeneous. The lack of a common parallel framework results in divergent codebases, difficulty in porting, higher maintenance cost, and thus difficulty achieving optimal performance on target architectures. Our paper examines two representative parallel applications and describes the code structuring and annotations required to derive a single codebase that is parallelizable across representative heterogeneous architectures, such as multi-core CPUs and GPUs. Drawing on previous work in the area, we create a universal high-level directive-based framework that supports both of these architectures and implements execution on each via translation to OpenMP and the PGI Accelerator API, respectively. We demonstrate that a high-level framework can support a common codebase that executes efficiently on heterogeneous architectures. Our results show that, when combined with a state-of-the-art parallelizing compiler, such a framework can yield performance comparable to custom code or a native language. Further, we show that the approach increases programmability, reduces code size, and decreases maintenance cost.
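The idea of one annotated loop that is translated to either OpenMP or PGI Accelerator directives can be sketched roughly as follows. This is not the paper's framework or its annotation syntax: the `PARALLEL_LOOP` macro, the `TARGET_CPU` and `TARGET_GPU` build switches, and the `saxpy` kernel are all hypothetical names chosen for illustration, and the accelerator branch assumes the combined `acc region for` directive of the PGI Accelerator model (real kernels would typically also need data-movement clauses).

```c
#include <stddef.h>

/* Hypothetical build-time switches (assumptions, not from the paper):
 *   -DTARGET_CPU : expand the annotation to an OpenMP work-sharing loop
 *   -DTARGET_GPU : expand it to a PGI Accelerator-style compute region
 * With neither defined, the annotation expands to nothing and the loop
 * runs serially, so the same source still compiles everywhere. */
#if defined(TARGET_GPU)
#define PARALLEL_LOOP _Pragma("acc region for")
#elif defined(TARGET_CPU)
#define PARALLEL_LOOP _Pragma("omp parallel for")
#else
#define PARALLEL_LOOP
#endif

/* y[i] = a * x[i] + y[i]; iterations are independent, so the loop
 * can be distributed across CPU threads or GPU threads unchanged. */
void saxpy(int n, float a, const float *x, float *y)
{
    PARALLEL_LOOP
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The point of the sketch is only that the loop body is written once and the target-specific directive is chosen at build time, which is the property the abstract attributes to a single portable codebase.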
Similar resources
Trellis: Portability across architectures with a high-level framework
The increasing computational needs of parallel applications inevitably require portability across parallel architectures, which now include heterogeneous processing resources, such as CPUs and GPUs, and multiple SIMD/SIMT widths. However, the lack of a common parallel programming paradigm that provides predictable, near-optimal performance on each resource leads to the use of low-level framewor...
Architecture-aware cost modelling for parallel performance portability
Given the broad availability of increasingly heterogeneous parallel architectures and the conceptual complexity of parallel programming, it is crucial to develop a structured approach to parallel programming that balances expressiveness and portable performance. Since hardware evolves more rapidly than software, performance often suffers due to the lack of adaptation to the target platform, req...
Evaluating Performance Portability of OpenACC
Accelerator-based heterogeneous computing is gaining momentum in the High Performance Computing arena. However, the increased complexity of the accelerator architectures demands more generic, high-level programming models. OpenACC is one such attempt to tackle the problem. While the abstraction endowed by OpenACC offers productivity, it raises questions on its portability. This paper evaluates the p...
OpenCL Floating Point Software on Heterogeneous Architectures — Portable or Not?
OpenCL is an emerging platform for parallel computing that promises portability of applications across different architectures. This promise is seriously undermined, however, by the frequent use of floating-point arithmetic in scientific applications. Floating-point computations can yield vastly different results on different architectures — even IEEE 754-compliant ones —, potentially causing c...
HPVM: A Portable Virtual Instruction Set for Heterogeneous Parallel Systems
We describe a programming abstraction for heterogeneous parallel hardware, designed to capture a wide range of popular parallel hardware, including GPUs, vector instruction sets and multicore CPUs. Our abstraction, which we call HPVM, is a hierarchical dataflow graph with shared memory and vector instructions. We use HPVM to define both a virtual instruction set (ISA) and also a compiler inter...
Publication date: 2011